Recurrent neural network grammars (RNNG) are a recently proposed probabilistic generative modeling family for natural language. They show state-of-the-art language modeling and parsing performance. We investigate what information they learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enable closer inspection. We find that explicit modeling of composition is crucial for achieving the best performance. Through the attention mechanism, we find that headedness plays a central role in phrasal representation (with the model's latent attention largely agreeing with predictions made by hand-crafted head rules, albeit with some important differences). By training grammars without nonterminal labels, we find that phrasal representations depend minimally on nonterminals, providing support for the endocentricity hypothesis.
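For intuition, the sketch below illustrates the general idea of attention-weighted phrasal composition: scoring each child constituent against the phrase's nonterminal label and composing a convex combination, so the peak attention weight can be read as the model's latent "head". This is a minimal simplification, not the paper's exact GA-RNNG; the function names and the bilinear scoring form are illustrative assumptions.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    z = scores - scores.max()
    e = np.exp(z)
    return e / e.sum()

def compose_with_attention(constituents, nt_embedding, W):
    """Attention-weighted composition of a phrase (illustrative only).

    constituents: (n, d) array of child-constituent embeddings
    nt_embedding: (d,) embedding of the phrase's nonterminal label
    W:            (d, d) learned scoring matrix (hypothetical)

    Returns the composed phrase vector and the attention weights;
    the largest weight points at the child acting as the head.
    """
    scores = constituents @ (W @ nt_embedding)  # one score per child
    alpha = softmax(scores)                     # attention distribution
    composed = alpha @ constituents             # convex combination
    return composed, alpha

# Toy usage: three children of an NP; weights hint at headedness.
rng = np.random.default_rng(0)
d = 8
children = rng.normal(size=(3, d))   # e.g. embeddings of "the", "red", "dog"
np_label = rng.normal(size=d)        # embedding of the NP nonterminal
W = rng.normal(size=(d, d))
vec, alpha = compose_with_attention(children, np_label, W)
print("attention over children:", np.round(alpha, 3))
```

Inspecting learned weights of this kind is what allows the comparison against hand-crafted head rules described above.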